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Abstract 

We present a method to quantify uncertainty in the predictions made by simulations 
of mathematical models that can be applied to a broad class of stochastic, discrete, 
and differential equation models. Quantifying uncertainty is crucial for determining 
how accurate the model predictions are and identifying which input parameters affect 
the outputs of interest. Most of the existing methods for uncertainty quantification 
require many samples to generate accurate results, are unable to differentiate where the 
uncertainty is coming from (e.g., parameters or model assumptions), or require a lot of 
computational resources. Our approach addresses these challenges and opportunities 
by allowing different types of uncertainty, that is, uncertainty in input parameters 
as well as uncertainty created through stochastic model components. This is done 
by combining the Karhunen-Loeve decomposition, polynomial chaos expansion, and 
Bayesian Gaussian process regression to create a statistical surrogate for the stochastic 
model. The surrogate separates the analysis of variation arising through stochastic 
simulation and variation arising through uncertainty in the model parameterization. 

We illustrate our approach by quantifying the uncertainty in a stochastic ordinary 
differential equation epidemic model. Specifically, we estimate four quantities of interest 
for the epidemic model and show agreement between the surrogate and the actual model 
results. 

Keywords: Surrogate model, statistical emulation, uncertainty quantification, stochastic 
epidemic model, Gaussian process model, polynomial chaos, intrinsic uncertainty, paramet¬ 
ric uncertainty 


1 Introduction 

The uncertainty created by the stochastic processes and approximate parameters in mathe¬ 
matical models must be quantified to assess the reliability of the model predictions. As the 
complexity of models increases to include more detail, so does the number of parameters 
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that must be estimated. A sophisticated framework is required to quantifying the uncer¬ 
tainty created by nonlinear interactions between parameters and stochastic forces from a 
limited number of computational experiments. We will describe an approach for uncertainty 
quantification (UQ) for the predicted output of a simulation, referred to as quantities of 
interest (QOI). 

If computer time is not an issue, then information about the predicted distribution of 
QOI can be extracted by a traditional Monte Carlo approach [18]. In traditional Monte 
Carlo, a comprehensive set of simulations is created by sampling the model parameters 
according to their a priori distributions. If there are stochastic processes, these simulations 
are repeated for each parameter value to sample the variation in QOI created by the intrinsic 
stochasticity. The distribution of the QOI can then be reconstructed using standard kernel 
density methods [B]. However, in large-scale simulations, this type of Monte Carlo approach 
is prohibitively expensive. In this case, an emulator or statistical surrogate model can be 
used to characterize the QOIs over a range of possible model parameters [tuisiutiedj. 
Sufficient samples are generated until this statistical process faithfully reproduces the ob¬ 
served correlations and can account for uncertainty due to finite sampling of the simulation. 
These processes are then used as surrogate models to quantify the model uncertainty and 
the correlations of the QOIs to the input parameter values. 

We refer to uncertainty (variation) in QOI due to imprecisely known input parameters as 
parametric or epistemic uncertainty |6j. With parametric uncertainty, we do not know the 
specific model parametrization needed in our simulation to make accurate predictions. It 
is assumed that a believable range of values for a parameter and the probability associated 
with a value in that range is known, i.e., we know the probability density function (pdf) of 
the input parameter(s). Examples of parametric uncertainty abound; in epidemiology, the 
mean and variance of recovery rates for diseases are typically determined experimentally, 
whereas transmission rates are typically obtained from observing epidemic progression in 
a population. Parametric uncertainty represents uncertainty in QOI due to imprecisely 
known input parameters to the simulation. 

In addition to parametric uncertainty, some models’ predictions rely on the outcome of 
random events, which create uncertainty in the model predictions, even if input parameters 
are fixed. We refer to these stochastic variations in model QOI as intrinsic or aleatory 
uncertainty [6j. This type of uncertainty is observed in epidemic models when the number 
of individuals becoming infected on a particular day is a random event and a function of the 
stochastic nature of a communities’ contact network structure mm- Intrinsic uncertainty 
represents variation in QOI that is present even when input parameters for the simulation 
are fixed. 

The two types of uncertainty can be closely connected. For a specific example, the mean 
and variance of the distribution for the time it takes a person to recover from a disease may 
be specified as inputs to an epidemic simulation but may only be known imprecisely. If the 
imprecise knowledge is specified by a known probability density function we would label 
this mean and variance as a source of parametric uncertainty. Once the mean and variance 
are fixed, however, each simulation of a stochastic epidemic model will result in a distinct 
epidemic realization. The variation in these realizations, with the input parameters fixed, 
is labeled as a source of intrinsic uncertainty mm- 
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Our UQ approach combines multiple methods of statistical surrogate modeling to allow 
separation of parametric and intrinsic uncertainty. We show how to construct a statisti¬ 
cal emulator that distinguishes between the two types of uncertainty and can account for 
situations where the intrinsic uncertainty is non-Gaussian. In the presence of intrinsic un¬ 
certainty, the simulations are sampled randomly for each fixed input parameter. A kernel 
density approach [8] is used to estimate the distribution of QOI for each fixed parameter¬ 
ization, and the contribution to the variation from intrinsic sources is separated using a 
non-intrusive polynomial chaos method m • The inclusion of the polynomial chaos decom¬ 
position allows our method to account for non-Gaussianity in the intrinsic variation of the 
QOI. Once the polynomial chaos decomposition has been performed, the contribution to 
variation from parametric uncertainty can be analyzed separately using a Gaussian process 
emulator mmm- 

Since the emulator adds additional uncertainty when it interpolates QOI in regions 
where there are few samples, the surrogate model constructed here has a variance that 
increases at parameter values far from samples of the simulation. We will describe the 
approach for a situation where the QOIs can be approximated by a unimodal distribution. 
If this condition is not satisfied, then clustering methods can be used to reduce the QOI 
distribution to several unimodal random variables and apply the emulator to each one 
separately [20]. In this work, we eliminate the multi-modality of a simulations’ output in a 
pre-processing step by considering as output, simulation results that fall close to one of the 
modes in the distribution of simulation predictions. This step can be thought of as studying 
the random variable that is the QOI from the simulation, conditioned on the event that 
the prediction is near a particular mode. Each mode can then be studied as a separate 
conditional response of the simulation. 

Although our approach uses fewer samples than a standard Monte Carlo method, it can 
still require significant computational resources to construct the surrogate model, especially 
if there are many parameters with intrinsic or parametric uncertainties. The emulator itself 
is an approximation and samples from the emulation will not exactly reproduce the distri¬ 
butions of the models of QOI. Although the mean and variance of the emulated probability 
distributions may converge rather quickly, the higher order moments can require a large 
sample size. As more samples of the simulation are included in the emulator construction, 
the emulation will behave more like the actual simulated model, though the exact rate of 
convergence remains an open question. 

In the next section, we introduce the stochastic epidemic simulation that we use as an 
example throughout the paper. We describe how to account for intrinsic uncertainty using 
the Karhunen-Loeve decomposition and a non-intrusive polynomial chaos decomposition. 
This is followed by our implementation of Gaussian process regression to model the effect 
of parametric uncertainty and finite sample size. 


2 The stochastic epidemic model 

Throughout the description of our surrogate modeling methodology, we will keep in mind 
a particular example coming from epidemic modeling mm- This work was motivated by 
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the lack of approaches to quantify uncertainty for large scale agent-based epidemic models 
0H3I- Our emulation method is general enough to be applied to any mathematical model 
simulation but due to our original motivation, it will be demonstrated using a stochastic 
epidemic model. The model is a system of stochastic differential equations (SDE). Each 
term in the SDE, represents the number of individuals in a particular disease state, i.e., an 
individual is susceptible to the disease, infected with the disease and can infect others, or 
recovered from the disease and has immunity. In the epidemic modeling literature, this is 
known as the classical Susceptible-Infected-Recovered (SIR) disease model [3j. 

The differential equations for our SIR model are briefly outlined here, for further refer¬ 
ence and a complete derivation see DQEl. The constant size of the population is denoted 
N. The number of individuals in each category at time t > 0 is denoted as S{t), Z(t), 
lZ(t) for susceptible, infected, and recovered, respectively. Since the total population is 
constant, the number of recovered individuals satisfies 7 Z(t) = N — S(t) — Z(t). Letting 
Z(t) = (5(t),Z(f)) T and defining the time dependent mean and covariance 


A(Z) 


-jrSZ \ ( &S1 SI 

, V(Z) = 

£sz- 7 zj \-&SZ jfSZ + 7 Z 


we can express the time evolution of Z(t) by the Ito SDE [9] 


dZ(t) = A{Z)dt + B(Z)dW, Z{ 0 ) = ( S 0 ,Z 0 ) T . 


( 2 . 1 ) 


( 2 . 2 ) 


Here B(Z) = \JV (Z), V(Z) is symmetric and positive definite, and W = (Wi,VV , 2 ) t is 
a vector of independent Wiener processes [9j. The parameter f3 > 0 is the infection rate, 
representing the rate at which individuals in the population become infected. The parameter 
7 > 0 is the recovery rate, representing the rate at which an infected individual is removed 
from the population. In this model, once an individual recovers, they are considered either 
dead or immune, and are removed from the model completely. 

In our example, for a given simulation, we analyze a population of N = 10,000 indi¬ 
viduals with initial conditions Z( 0) = (9998, 2) T . When a recovery rate and infection rate 
(/3,y) are fixed and a simulation is run, we record the following four QOI: 


(5i(/?,7;w) := -Pinf = Maximum % population simultaneously infected 
Q2((3,T, oj) := T p = Time to the peak, p n f, in days ^ 

:= T ( i = Duration, number of days % infected is within 50% of peak 
Q4(/3,7; w) := Cinf = Cumulative % ever infected 


produced by the simulation. These four QOI are depicted for one sample of the stochastic 
SIR simulation in Figure [l] The state variable u indicates random variation due to the 
stochastic effects in the SDE. After the simulated data is generated, we restrict ourselves 
to studying only the simulations where at least 10% of the population becomes infected. 
This will have the effect of making the joint distribution of the QOI, (Qi, Q2, Q3, Qa), 
approximately unimodal, which is a necessary assumption for our emulation method. The 
vector output of the simulation will be denoted by 

Q(P: 7! w ) = (Qi) Q 2 , Q 3 , Qa) T ■ (2.4) 
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The transmission and recovery rate parameters, (/3, 7 ), are our source of parametric 
uncertainty. In practice, these are determined from experimental data or observations of 
disease spread in similar populations. They are only approximately determined and we 
model their uncertainty by treating them as random variables. We assume that (/?, 7 ) have 
lognormal distributions, 


P ~ In A f{np, <Jp), 7 ~ lnAf 0 7 , a: 


7/) 


(2.5) 


with fxp = 1, Op = 0.000125, ^ = 0.8, cr% = 0.000125. These distributions and their 
corresponding univariate histograms of the samples are plotted in Figure [2] Besides uncer¬ 
tainty in the QOI caused by imprecisely known transmission and recovery rates, we must 
also account for the variation introduced by the stochasticity in the SIR model, an intrinsic 
uncertainty. When sampling Q(/3, 7 ; oj) to quantify the model’s uncertainty, we first draw 
(/3, 7 ) samples from the distributions in Equation (2.5). For each of these parameter sets, 


we then simulate multiple solutions to Equation (2.2) and record the samples of Q(/ 3 , 7 ;cj 


Repeating the simulation multiple times for each parameter set is essential to sample the 
contribution of intrinsic uncertainty to the distribution of the QOI. Once this is done, we 
use kernel density estimation [ 8 ] to form an approximate distribution of Q(/3, 750 ;). The 
result of this process is shown in Figure [3| 




Figure 1: A sample of Q(/J, 750 ;) from one SIR realization. (LEFT) % infected time series for a fixed (/?, 7 ). 
Q 1 , Q 2 , and Q 3 are labeled. (RIGHT) % cumulatively infected time series for a fixed (/3, 7 ), Q 4 is marked. 


Notation: To ensure generality in our presentation and to avoid overly cumbersome no¬ 
tation, we denote X(9‘,u) = (X T (9',uj)) d =1 € M. d to represent our vector quantity of interest. 
Here, r is a discrete index parameter indicating the specific QOI. In practice, r could also 
represent discrete samples of a continuous parameter, e.g., time. The vector 9 E MP will 
denote input parameters to the simulation and oj is the state variable controlling stochastic 
variation. 

For the simulation, output X t (9;oj) uncertainty due to variation in 9 will be parametric 
uncertainty. We will assume that we know the probability density for 9 and can choose 
how we will sample from that density when the simulation is run. The uncertainty in our 
output due to oj will be referred to as intrinsic uncertainty , which is characterized by the 
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Figure 2: Samples of (/3, 7 ) parameters from their log-normal distributions and corresponding univariate 
histograms of the samples. These samples are then used as input parameters to (2.2 1 , effectively exploring 
the contribution of parametric uncertainty to the distribution of 


absence of a parameterized probability space and by our inability to choose how to sample 
its effects. 

For the SIR model, the notational translation will be X t (8\uj) = Q T (f3, 7 ;u;) with r = 
1, 2, 3,4 and 8 = (/3, 7 ). 


3 Karhunen-Loeve decomposition 

In this section, we use the Karhunen-Loeve (KL) decomposition to transform the QOI to 
a set of uncorrelated random variables. The lack of correlation between the transformed 
QOI will aid in our construction of a polynomial chaos representation of our model. Using 
the KL decomposition has the added benefit of possibly reducing the dimension of the QOI 
space being emulated. 

We first decompose X r (8;u) so that the contributions from r and 8 are separated from 
those of ui. Define 

X = E[X(8-uj)\ 

so the process may be represented as 

X(6;u) =X + X°{8;u). (3.1) 

The zero mean random vector X°(8-,uj) is now emulated. 

In order to remove the correlations between the QOI and reduce dimensionality, we use 
a Karhunen-Loeve (KL) |10l 111] decomposition for X°. The covariance function is given by 
the d x d square matrix 

I<(s,r) = E[X° s (8-u J )X 0 r (8- 1 u)}, s, r = 1,2,..., d. (3.2) 
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Figure 3: One dimensional and two dimensional marginal distributions from the four dimension distribution 
corresponding to Q(/3,7; tu). Kernel density estimation was used to generate each of the marginal distri¬ 
butions from samples of the SIR model. The goal of the methods detailed in this paper is to be able to 
reconstruct an approximation of this distribution from very few samples of the actual SIR simulation and 
to be able to directly generate approximate samples of Q(/3,7;cj) quickly. 


In the KL decomposition, one then finds the eigenfunctions of the covariance by solving the 
eigenvalue problem 

d 

J2 r )/" = An/ T n , T, n = 1, 2,..., d. (3.3) 

r=l 

The set of Euclidean unit length eigenvectors {/ n }n=i C forms a basis for the random 
vector X°(6-,u) so we may project onto this basis. Coefficients of the projection are 


= (X°(e-,u),n = ^X 0 T (6-,Lo)f T 


(3.4) 


T— 1 

The KL decomposition of the zero mean random vector is 

d 


x\e-u) = Y J Uo-^)f r 


(3.5) 


n= 1 


and the truncated KL decomposition is obtained by taking the first N < d terms in the 
series 


N 




(3.6) 


71=1 


With Equation (3.6), we have reduced the emulation problem to the problem of emulat¬ 
ing the random vector of uncorrelated coefficients £(#;<*;) = (£i(0;w),..., £/v(0; u;)). Since 
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we are not assuming that X(8;ca) is Gaussian the coefficients, £i(0;w), are not necessarily 
Gaussian and therefore not necessarily independent. 


Correlations between entries of X(8;cu), corresponding to different QOI, is now con¬ 
trolled by the entries of the eigenvectors f n . If the eigenvalues in Equation (3.3) decay 
rapidly, the number of terms, N, can be taken to be much smaller than the original number 
of QOI, effectively reducing the output dimension. 


KL Coefficients 



Figure 4: Starting with the four dimensional distribution of Q(f3 ,7;w), shown in Figure [HJ the KL decom¬ 
position is computed. In this figure, we show the one and two dimensional marginal distributions for the 
first three KL coefficients derived from the distribution of Q(/?, 7 ;w). When compared to the marginal dis¬ 
tributions in Figure [3] one can see that the correlations between the KL coefficients is less than those of the 
original QOI. This is important for implementation of the polynomial chaos decomposition. 

The effectiveness of the KL decomposition on our example problem is shown in Figures 
[ 4 ] and [HJ Since our SIR model output, Q(/3,7;w), is only four dimensional, a complete 
KL decomposition has four coefficients. In Figure |4j the distribution of the first three KL 
coefficients is visualized. These coefficients are used to reconstruct an approximation to 
the distribution of Q(/3, 7 ; ui) shown in Figure [ 3 } The approximate reconstruction has a 
distribution depicted in Figure [5] 

4 Polynomial chaos expansion of coefficients 

A reduced representation for the random vector of uncorrelated, but not independent, ran¬ 
dom variables £(0;w) = (£i(0; ui), ..., £at($; u)) is constructed using a polynomial chaos 
(PC) decomposition. This accomplishes two goals. First, at a fixed 8, the PC decompo¬ 
sition computes a low dimensional approximation to the distribution of £(8;oj) while still 
allowing for non-Gaussianity. Second, the PC approximation gives us a way to generate ap¬ 
proximate samples from the distribution of £(0; ui) at a fixed parameterization 8. Although 
KL gives a low dimensional approximation to the orginal distribution of X{6\ui) it does not 








QOI KL Reconstruction 


Figure 5: The first three KL coefficients depicted in Figure [ 4 ] are used to reconstruct an approximation to 
the distribution of Q(/3, 7501 ). Using this approximation ailows us to reduce the dimension of the QOI. 


provide a way to generate samples from that distribution. 

In polynomial chaos, a random variable is approximated by a series of polynomial basis 
functions that form a basis for the underlying probability space of the random variable. 
It has been shown [22] that the particular choice of basis makes a significant difference 
in the rate of convergence to the random variable. We will present our methods using the 
Hermite polynomials [22], which work well for our SIR example, though the techniques used 
are independent of the basis. Different basis of polynomials, in the generalized polynomial 
chaos scheme, are orthogonal with respect to different measures. The Hermite polynomials 
are orthogonal with respect to the standard Gaussian density. This makes them ideal 
for application when the underlying distribution of the random variable being emulated is 
approximately normal. If there is reason to suspect that a distribution is far from normal 
then another basis should be considered [22]. 

The Hermite polynomial of degree k in a single dimension is denoted by ipk(x). This set 
of polynomials can be defined recursively by 

V’o(z) = 1 (4-1) 

-;/> i(x) = x 

l/j m +l(x) = X1pm(x) - m'lpm-iix) fol’ 771 = 1, 2, ... . 

Since the Hermite polynomials are orthogonal with respect to the standard normal density 

w(x) dx = j_ e -3:2 / 2 dx, 

V2tt 


.fr\ 0 ~X 2 / 2 rlnr — n !A. 


the relation 


1 


■OO 









holds for all i,j = 0,1, 2,.... The Hermite polynomials in higher dimensions are formed 
by tensor product. Let a = («i, 02 ,, ctd) G 7L d be a multi-index and define the Hermite 
polynomial on by 

'La(x) = 1 p ai (xi)^a 2 (x 2 ) ■ ■ .^a d {x d ). (4.2) 

To index the Hermite polynomials in higher dimensions, we use a graded lexicographic 
ordering {4J on the multi-indices in dimension d. That is, we will use T^(x) to refer to the 
multidimensional Hermite polynomial with the k th multi-index in the graded lexicographic 
ordering. 

With the above indexing on higher dimensional Hermite polynomials, the polynomial 
chaos decomposition of the random vector £(0;u;), up to order K, is given by 

K 

m = m w(0) ~ E c * w^fc(c)- ( 4 - 3 ) 

fc =0 


Here c k {0) is a length N vector of coefficients corresponding to each entry in £ so, for 
n = 1,2,.. ,,N, 


Zn(0) = U^(0) 


K 

E 

k=0 


Cnk(e)* k {0- 


(4.4) 


The new variable £ = (Ci> C2, ■ ■ ■ > Cn) is a random vector of independent standard normal 
random variables. This vector serves the purpose of parameterizing the probability space 
corresponding to co. That is, sampling from £ is equivalent to sampling from ui. In Figure 
[6j we show the reconstruction of the first three KL coefficients for the QOI in the stochastic 
SIR model using Equation (4.4). 


The difficulty with the expansion given by Equations (4.3)—(4.4) is the computation 


of the coefficients c k (0). These are formally defined through Galerkin projection onto the 
Hermite polynomials [22] by the formula 


Cnkifi) 


E[g n (g;o;)T fc (C)] 


(4.5) 


This formula is only formal since £ n (0;cj) and 4 t(£) live over different probability spaces. 
Computing the expectation E[£ n (0; w)4'/ c (C)] in Equation (4.5) is performed by Monte Carlo 
approximation, so sampling of £ n {Q\u) and ^kiC) must take place over the same probability 
space. To compute this expectation, we transform the two probability spaces to a common 
domain. There is a standard method to transform two different, finite dimensional, proba¬ 
bility spaces into a common space, which usually relies on having an explicit representation 
of the underlying distributions. Lacking an explicit representation, we will form an approx¬ 
imation using a kernel density estimate. 


The Rosenblatt transformation m uses the conditional cumulative distribution func¬ 
tions to map a set of jointly distributed random variables onto a set of independent uniform 
random variables on [0,1]. In terms of the conditional cumulative distribution, the Rosen- 


10 







KL coefficient, PC reconstruction 


-20 -10 0 10 20 -10 -5 0 5 10 15 



Figure 6: Reconstruction of the one and two dimensional marginal distributions for KL coefficients using 
the first eight terms in the polynomial chaos reconstruction given by Equations (4.31 (4.41. Since the 
Hermite polynomial chaos approximation is formed from smooth functions sampling from the truncated 
decomposition has the effect of smoothing the original distribution and clipping the low probability regions. 
This can be observed by comparing the above Figure with the original KL distribution in Figure |4| 


blatt transformation is given by 


ui = Ti(£i) 

= ^2|l(6l£l) 
u 3 = -^ 3 | 1 , 2 ( 6 | 6 ) 6 ) 


(4.6) 


UN = FN\l, 2 ,...,N-l{£N\£l,&, ■ ■ • ,£/v)- 

Once this map is computed, the cumulative distributions are used to generate samples 
from the joint distribution of the random vector £ from N independent samples of uniform 
random variables on [0,1]. This process is the inverse Rosenblatt transformation, which 
maps a set of N independent uniformly distributed random variables to the random vector 
£ of length N. This process uses the inverses of each of the marginal cumulative distributions 


in Equation (4.6). We will denote this inverse Rosenblatt transform by 


g( u ) = £,u~ u[o, l] 


N 


(4.7) 


Likewise it is standard to map u ~ [7(0,1]^ to independent normally distributed random 
variables £ ~ 1V(0,1)^. We denote this map by 


Z(u) = C,u~Z7[0,l] 


N 


(4.8) 


With the above maps one can then compute the expectation in the numerator of (4.5) 
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as follows 


E[£„(0;a;)* fc (C)] = — j £„(*; u)9 k (0e~^<f d( 


(4.9) 


'[o.ffi 


ge(u)^k(K u ))du. 


We note that the integral is computed using a Monte Carlo approach and we have used the 
notation < 70 ( 11 ) to indicate that the particular inverse Rosenblatt transformation depends 


on the parameter 9. As stated above, to calculate the expectation in Equation (4.9) using 


a Monte Carlo scheme one must be sure to use the same sample of the uniform random 
variable u when calculating values for gg (u) and /(u). It is also important to use a Monte 
Carlo scheme of sufficient order relative to the size of the coefficients involved in Equation 


(4.3). 


When using Equation (4.3) to characterize intrinsic uncertainty in a simulation, one does 


not have an explicit form for the conditional cumulative functions in (4.6). In practice, for 
a fixed 6, one only has a finite number of samples of the random vector £( 0 ;cj), which we 
will denote by {£( TO )}^ =1 . To estimate the conditional cumulative functions, we first use 
the samples to estimate the joint pdf of the random vector £. This can be done using 
a kernel density estimation (KDE) method 0. We use univariate tensor product kernels 
where our univariate kernel for the i th variable is denoted by iQ(£j) and our bandwidth is 
denoted by h > 0. For the derivations that follow, it is important to recall that, in KDE, 
the kernel K{ has integral equal to one. For the samples C the KDE at a 

point i = (£ 1; £ 2 , • • ■, id) is given by 


1 M 


' £ — £( m )' 


(4.10) 


m= 1 
M 




ft-ft 


(m)' 


Mh d 


K‘2 




• • • K., 




m= 1 


in the functions (4.6) 
densities, 


The goal is to now use Equation (4.10) to build the conditional cumulative distributions 


The cumulative distribution functions are built from the marginal 


Pl,...,n(^h ■ ■ ■ 1 £n) — J p {0 C^n+l ‘ ‘ ‘ 


(4.11) 


Mh 


1 M 




M' 


• • • I<n 


£n-£ 


(m) ' 


m= 1 
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This leads directly to a formula for the marginal conditional cumulative distributions 


Pn\n—l,...,l (£n|£l ? • • • ? £n—l) n 
-oo 


(4.12) 


j <n PWlfe, ■ • ■ ,jn) 

J— oo Pl,...,n— l(£l> ■ • ■ >£ra—l) 

£m=l llntlKl 


d^r 


It Kn )d£ n 


Ef= i < ik ^ 


n—1 




(m) 


Once these can be computed, the inverse Rosenblatt transform is easily accomplished. 
For each sample u ~ U[ 0, l] rf , one iteratively goes through the dimensions starting by 
computing Fi(£i) for increasing £i until Fi(£i) > U{. which fixes a . Next compute 
F 2 |i(^ 2 ^ 1 ) for increasing £2 until F 2 \\ felO) > which hxes £ 2 - This process is repeated 
d times until we arrive at a map <y(u) = £. 


Now that we have a method of computing the expectations in the polynomial chaos 
coefficient definitions (4.5), it is possible for us to build an emulator from our data that 
combine the Karhunen-Loeve decomposition in Equation (3.6) and the polynomial chaos 
expansion in Equation (4.3). This gives us an approximate model for the stochastic process 
represented in the simulation, 


N 

X(0;o;(C))^X + ^ 

n= 1 


K 


^c nfe (0)4> fc (C) If 


\k =1 


(4.13) 


The PC expansion must be computed separately for each value of our input parameter 
set, 9. The dependence of the stochastic process on the input variables is then only seen 
through the coefficients c n k{9) in the PC expansion. One important property of this em¬ 
ulation is to provide a map from the multivariate standard normal random variable ( to 
the intrinsic uncertainty in the random vector £(#;u;). This defines a probability space for 
the intrinsic uncertainty in the simulation that can be sampled quickly. Once the emulator 
is constructed, for a fixed set of inputs, many realizations may be computed to ascertain 
the approximate distributions associated with quantities of interest when only intrinsic un¬ 
certainty is present. The use of the polynomial chaos decomposition allows this intrinsic 
uncertainty to be represented in a non-Gaussian, and somewhat, non-parametric way which 
has the potential to represent very general behavior in the intrinsic variability. 


An approximation of Equation (4.13) is depicted, for the SIR example, in Figure [7J 
Again, we see the smoothing out of the QOI distribution’s finer scale features. However, 
general shape, correlation structure, and non-Gaussianity are preserved well. This approx¬ 
imation can be made exact by increasing the number of samples from the SIR model used 
to form the KL decomposition and the PC decomposition as well as including more terms 
in each of the decompositions. 
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Figure 7: Reconstruction of the one and two dimensional marginal distributions of the QOI using the KL- 
PC decomposition in (4.131. This figure should be compared with the original QOI distribution depicted in 
Figure[3] Notice that much of the original distribution structure is maintained. However, the approximation 
inevitably smooths the original distribution and cuts off some of the lower probability regions. This is due 
to a mixture of the effects of truncation in the KL-PC decomposition and the bandwidth chosen in the KDE 
used to compute the PC coefficients. 


5 Gaussian process regression on PC coefficients 


It remains to provide a surrogate model for the dependence of the stochastic process on the 
parameterization, 9. From the KL-PC approximation in (4.13) for each set of inputs, we 
arrive at a set of N-K coefficients c n k{9). We perform Gaussian process (GP) regression on 
the coefficients with the samples from sets of input parameters as data. There are many 
good reviews on the advantages of Gaussian processes in statistical modeling mm - We 
highlight two properties that are particularly important. First, a Gaussian process regres¬ 
sion is an interpolant. If the coefficients are computed at a specific 9 value and a Gaussian 
process is formed for the coefficients, then the GP coefficient values will equal the observed 
coefficients at that 9 value. Second, a Gaussian process regression fits a process to data 
through a series of observed input values, which is done in a way that ensures the variance 
in the process will grow for 8 values that are farther away from observations. This allows 
uncertainty in the emulator to emerge that is attributable to lack of observations, realiza¬ 
tions taken at too few sets of input parameters, and to describe the actual dependencies of 
the simulation on input parameters. 

To build the GP regression, we follow the methods introduced in BE], the necessary 
details are included for completeness. First, we form a vector of the coefficients, 


c(0) = ( c llW, (8), • • ■ , C 2 l(0), C22{9), ..., c N ( K _ 1 )(9),cnk(0)) t . (5.1) 

The statistical surrogate model is then built for c : MP —> M. N ' K . If the PC coefficients are 
constructed (i.e., sampled) at m parametric values ( 0 i, 02 , • • • ,0m}, then we can form the 
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(. NK ) x m data matrix, 


/ Cll(0l) 

£11(6*2) 

. Cll(0777,) 

Cl2(^l) 

£12(6*2) 

. ^12(^777.) 

\ cnk( 9 i ) 

cnk( 9 2 ) • 

. CNK( 0 m 


This is then standardized by subtracting the mean over all the samples and dividing by 
the standard deviation of all the coefficients to form C s td- This standardized data matrix is 
then decomposed using a singular value decomposition. In addition, a truncated ( NK ) xp c 
matrix of singular vectors is constructed, K = [ki, k 2 , ..., k Pc ] with E K n ' k . We truncate 
the singular vectors to keep only those corresponding to large singular values. C st( j is used 
to build an emulator, corresponding to a standardized map c st d : —> ~R N ' K , of the form 


Pc 

c st d(#) ~ v) + s (v) 

i=1 


= KW{9- rj) + 5{rj) 


(5.3) 


where W{9]rj) = {w\{ 9 \rj),... ,w Pc ( 9 -,r])) T and 5 (rj) is an independent zero mean Gaussian 
process modeling the discrepancy between the truncated decomposition and the data. This 
is taken to be 5{rj) ~ J\T{OnKt^ 1 Ink) with hyperparameter Aj. The parameter will 
control how much noise is present, we will refer to A^ as the noise precision parameter. Here 
Wi( 9 ;rj) will be p c independent zero mean Gaussian processes over the input space 9 E M p . 
These will be constructed from the p c x m matrix 


( w\{ 9 \) 

££1(6*2) . 

■ Wl ( 9 m ) 

\ 

w 2 { 9 \) 

££2(6*2) . 

■ W 2 (6* m ) 


K wpM 

w Pc { 9 2 ) . 

w Pc(9m, 

/ 


(5.4) 


coming from the truncated singular value decomposition, C s td = KW. 
Each of the Wi(9 ; rj) have covariance model [7J 


R{9,9') 


'Wi 


k=1 


4 ( e m- e ( fc) ) 2 

°Wi{k) 


(5.5) 


with hyperparameters \ Wi and p Wi = (p Wi m, ■ ■ ■ , p Wi (p)) T ■ We will choose prior distributions 
on the hyperparameters that ensure p Wi (k) £ (0,1) so that R(9, 9') decays as ||0 — 9’\ —> oo. 
The notation 9^ is used to denote the k th coordinate of 9 E W. For the Bayesian regression, 
we define the length m vector w$ = (wi(9\),Wi(92), ■ ■ ■ , Wi(9 m )) T for i = 1,2 ,p c . The 
vector Wj has covariance given by 

X m C (0-,p Wi )- ( 5 . 6 ) 


A symmetric mx m matrix is obtained by applying the covariance model (5.5) to each 
pair in the sample set { 9 \,..., 9 m }. 
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Now we define the mp c vector of all processes w 2 ; evaluated at the sample points, w = 
(w^, wj ...., wJj T , distributed as 


w ~ AT( 0 mpc , S w “l - A^ Imp c ) 

S w = diag (A wl c (9;p Wi )), 

i=l,...,p c 

the covariance matrix being size ( mp c ) x ( mp c ). 

Then the m • ( NI \) column vector of the standardized data 

^std (Cstd(^l) ; Cstd(^2) j • • • ; Cstd(^m) ) 


(5.7) 

(5.8) 


is distributed 


c s td ~ Af(O m .( M '), K S W A + A^ 


(5.9) 


■ j Im 


k p J is the (m • ( NK )) X ( mp c ) matrix formed by 


Where K = [I m <g> ki ? Im ®k 2) . 
the Kronecker product of the principle vectors. Notice that from Equation (5.3), it follows 
that K t c st d = w. 


Relations (5.3) and (5.7)-(5.9) define a likelihood coming from the density of the normal 
distribution 

L(w\X s ,X Wi ,p Wi ,i = l,...,p c ). (5.10) 

In the regression step the posterior distribution is given by 

p(^S,Xwi,Pwi,i = 1, ■ • ■ ,p c |w) oc (5.11) 

Pc ( P 'I 

L(w|Aj, X Wi ,p Wi ,i = 1,... ,p c )tt(X s ) n n ^{Pwi^k)) r • 


i =1 


k =1 


This is sampled using a Metropolis-Hastings MCMC (18] method such as a univariate 
random walk or an independence sampler. One can either take the estimated maximum 
likelihood values for the hyperparameters X$, X Wi , p w . or choose a range of values from the 
samples. 

We assume the A hyperparameters have prior gamma distributions and the p hyperpa¬ 
rameters have beta distribution priors [7], 

^(Awj) oc A““ - 1 e -b “ A ™i, i = 1,2,.. . ,p c (5.12) 

tt( P Wi (k )) oc ^“^(1 - p Wi (k)) bp ~ 1 , i = 1,2,... ,p c , k = 1,2,... ,p 
tt(X 5 ) oc A^ - 1 e- 6 * Ai . 


Once values of the hyperparameters are chosen through exploration of the posterior 
(5.11), predictions can be made for the emulated c st d(#*) at some new parameter set 0* E 
M p . Let the prediction vector, at a set of parameter values {#*, 0%, ■ ■ ■, # a }, be denoted 
c^ td = (c std (0l) T , c s td(# 2 ) T > ■ • ■ i c s td($s) T ) T j a length s-(NK) column vector. The prediction 
is based on prediction of the s ■ p c vector w* = ((w]') T , (wj) 7 ,..., (w*J T ) T , where w* = 
(wi( 0 l),Wi( 02 ), ■ ■ ■ ,Wi(0*)) T for i = 1,2,... ,p c . 
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From our definitions above we see that the data vector and prediction vector are jointly 
distributed as 


w 

w* 


1 -^"(0(s+m)p c ? ' T ^<5 I(s+m)p c ')i “ 




(5.13) 


The terms in the covariance matrix come from applying our covariance model (5.5) to each 
pair of the respective parameter sets. Thus, £ w is the m ■ p c square matrix obtained by 
applying Equation (5.5) to each pair in the sample set {Q \,..., 9 m } for each Wj. Similarly, 
£ w * is the s ■ p c square matrix obtained by applying Equation ( |5.5| ) to each pair in the 
prediction set {0*,..., 6**} for each w*. Finally, = E w * w is the (s-p c ) x ( m-p c ) matrix 

obtained by applying Equation (5.5) to each pair (9^,9*) in the sample and prediction set. 
The predictions are then distributed as follows: 


w* ~ U{p*,n*) 
p = S w * w (S w + X s Imp c ^} 


(5.14) 


-l 


w 


Q* — (E w * + \ s l I S p c ) — E w * w (E w + 1 Imp c ) 


-l 


From the predictions w*, we can define predicted standardized polynomial chaos coef¬ 
ficients using the relations 


<&d = (5.15) 

K* = [Ib ® ki, I s <8) k 2 ,..., I s <8> k p J. (5.16) 

Thus, after destandardization, we have defined a new map through Bayesian Gaussian 
process regression for the coefficients denoted by 

c(9; rj) = {c u (9;ii),c 12 (9]7]), ... ,c 2 i{9;r]),... ,c NK (9-,7i)) T . (5.17) 


The random variable p is the associated state space variable for the Gaussian process. For 
a given realization of the GP regression p is fixed. 


Once the GP is formed for the PC coefficients, we build a complete emulator of the 
intrinsic and parametric uncertainty in the simulation. The final form of the statistical 
surrogate is 


N 

X e (9 ] (,p) = X + Y, 

n= 1 


K 


J]c nfc (6»,? ? )T fc (C) f 


\k=1 


(5.18) 


In Equation (5.18), the contributions from each of the separate sources of uncertainty 
is represented explicitly. The first source of variation is the change between QOI. This 
variation is controlled in the emulator through the KL decomposition parameter r. Intrinsic 
variation is represented in X e by the standard multivariate normal random variable £ arising 
from the PC expansion. The parameter 9 occurs in the GP of the coefficients and controls 
the dependence of the emulator on the input parameters. Lastly, the use of a GP regression 
to relate PC coefficients at distinct 9 values introduces a new source of uncertainty that is 
not originally present in the model. This is uncertainty in the emulator’s dependence on 


17 










KL coefficients, GPPC reconstruction 
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Figure 8: One and two dimensional marginal distributions of the KL coefficients for the SIR QOI. These 
were reconstructed using Gaussian process regression on the polynomial chaos coefficients depicted in Figure 
[6] using equation (5.181. 


the inputs that has arisen from lack of sampling in 9. The random variable rj now quantifies 
this uncertainty, giving a way to sample from the emulator that will inform how effective 
the input evaluation scheme was. 

In using X e to compute approximate probabilities of specific outcomes from the sim¬ 
ulation, each of these variables can be sampled independently. This permits independent 
study of each source of variation in the model to quantify its effects on the simulation pre¬ 
dictions. Moreover, it allows one to study regions of the input space where surrogate model 
discrepancy is largest, i.e., where r] has the biggest contribution. This provides a framework 
to determine what simulation realizations would improve X e the most. 

We use samples of Q to compute a statistical emulator for the stochastic SIR model. 
We denote the emulation by 


4 / K 

Q e (p,r,rh()=®[Q] + EE CnkW,r,v)^k(0 

n =1 \fc=l 

/3 ~ In A 7 ~ lnAf(// 7 , cr^), C ~ Af(0 4 ,h). 


In 


(5.19) 

(5.20) 


The reconstruction of the first three KL coefficients using Gaussian process emulation of the 
PC coefficients in our KL decomposition is shown in Figure [8j Then the KL reconstruction 
can be used to approximate the original QOI random variable. The distribution of this 
approximating random variable is depicted in Figure [9j Overall the shape of the distribution 
is maintained but more importantly, by using this approximation, we have the ability to 
sample from each source of uncertainty separately. 

Recall, r\ represents the state variable of the Gaussian process c n fc. The variables (/3,7) 
can be sampled to explore the uncertainty in Q e due to uncertainty in the input parameters. 
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Figure 9: One and two dimensional marginal distributions of the SIR QOI reconstructed using our combi¬ 
nation of Karhunen-Loeve, polynomial chaos, and Gaussian process regression. This is an approximation of 
the original distribution in Figure[3] The approximation captures the overall shape and range of the original 
QOI. However, the approximation does concentrate more around the mode of the original distribution. As 
sample size increases this effect will diminish. 


Similarly, £ can be sampled to explore the uncertainty contribution due to intrinsic variation 
in the model. Finally, we can take many realizations of the Gaussian process c n k to study 
the uncertainty in the emulator Q e introduced by lack of sampling of the actual simulation 

Q. 

Since realizations of the emulator Q e are fast, we can use a Monte Carlo method com¬ 
bined with KDE to reconstruct an approximation of the distribution of Q. This distribution 
can then be used to estimate probabilities of events associated with the quantities of interest 
in Q. 


6 Conclusion 

We have discussed the problem of propagating uncertainty through a simulation that yields 
predictions effected by both intrinsic and parametric uncertainty. The term parametric 
uncertainty was used to characterize sources of variation in the simulation’s predictions for 
which the researcher knows the associated probability space and has control of how it is 
sampled when running the simulation. Intrinsic uncertainty was defined as variation in the 
simulation’s predictions that lacked an underlying parameterized probability space and/or 
a source of variation the researcher did not have control over. 

The statistical emulator constructed yielded a parametrization of the intrinsic probabil¬ 
ity space that allowed for non-Gaussianity and separated the different sources of uncertainty 
in the output. This approach included accounting for uncertainty introduced from lack of 
sampling the simulation. 
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We used a stochastic system of ordinary differential equations, representing a simple 
disease model, to illustrate our methods. We allowed the input parameters of infection rate 
and recovery rate to be random variables with lognormal distributions. The four quantities 
of interest, the peak infected percent of the population, the time to the peak of the epidemic, 
the duration of the outbreak, and the total percent of the population infected, were then 
emulated from samples. 

The emulation methods have potential application to uncertainty quantification in large 
scale computer simulations. Their utility can be evaluated by their ability to reconstruct 
joint distributions of QOI accurately from samples of the simulation. However, the em¬ 
ulation method has a large variety of tuning parameters and therefore requires human 
supervision in its application. It would, therefore, be of great utility to have reliable heuris¬ 
tics and theorems that give precise hypotheses on the types of stochastic processes for which 
the emulator will converge and the emulator’s rate of convergence, based on sampling size of 
the simulation. This is especially true in the case of stochastic processes whose distribution 
at a given parameter set depend on the parameter values. In the absence of theorems on 
convergence and rates of convergence for statistical surrogates, one would like to have a 
good suite of numerical tools at their disposal to evaluate the performance of the emulator. 
Such methods would need to be non-parametric and non-intrusive to be able to be applied 
to a large range of simulations without a priori assumptions. In the future, we will pursue 
these types of convergence results and numerical evaluation methods. 
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